Using Multiple Central Processing Unit Cores for Packet Forwarding in Virtualized Networks

ABSTRACT

Systems and methods for using a plurality of processing cores for packet processing in a virtualized network environment are described herein. An example system can comprise a scheduler operable to initiate a processing core of the plurality of processing cores. The processing core is operable to process a plurality of data packets. Based on the determination that the processing core exceeds a threshold processing capacity associated with the processing core, the scheduler sequentially initiates at least one subsequent processing core. The at least one subsequent processing core has a corresponding threshold processing capacity and is operable to process data packets of the plurality of data packets in excess of threshold processing capacities associated with preceding processing cores. Thus, the threshold processing capacities associated with the preceding processing cores are not exceeded.

CROSS REFERENCE TO RELATED APPLICATIONS

This non-provisional patent application claims priority benefit of, and is a continuation of, U.S. patent application Ser. No. 14/828,351, filed on Aug. 17, 2015, entitled “Using Multiple Central Processing Unit Cores for Packet Forwarding in Virtualized Networks,” which is hereby incorporated by reference herein in its entirety including all references cited therein.

TECHNICAL FIELD

The present disclosure relates generally to data processing and, more specifically, to methods and systems for optimizing usage of multiple processing units in virtualized environments.

BACKGROUND

The approaches described in this section could be pursued but are not necessarily approaches that have previously been conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.

Data centers are conventionally used to house computer systems and associated components (for example, telecommunications and storage systems). A data center generally includes a set of redundant devices, such as power supplies and data communications connections, environmental controls, and security devices. Trends in global technological development bring transformations to technical solutions related to organization of data centers. Such transformations relate to standardization, virtualization, automation, and security of the data centers. Virtualization technologies can be used to replace or consolidate multiple pieces of data center equipment. Virtualization may help to lower capital and operational expenses and reduce energy consumption.

Virtualized data centers are popular due to their ability to enable multiple virtual machines to share the same hardware in an over-subscribed way, because the virtual machines may not need peak capacity of the processing units (such as central processing units (CPUs)) and memory at the same time. However, CPU utilization associated with traditional packet forwarding in a physical networking environment does not align with the needs of virtualized networks. In a physical networking environment, since the CPUs are entirely owned and utilized for network or security purposes, the CPU utilization practices need to utilize a run-to-completion model to use the CPU all the time, even in a low load scenario. Furthermore, tasks need to be divided to be processed in multiple CPUs at the same time. Thus, existing environments are not optimized for virtualized networks as they consume all available CPU resources regardless of the load. Additionally, constant running of high-priority tasks on all available CPUs prevents other applications or virtual machines from using the CPUs.

SUMMARY

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

Provided are systems and methods for using a plurality of processing cores for packet processing in a virtualized network environment. An example method may commence with initiating a first processing core of the plurality of processing cores. The first processing core may be operable to process a first plurality of data packets. Furthermore, the method may include receiving a first data packet. The method may further include determining that the first processing core exceeds a first threshold processing capacity associated with the first processing core. Based on the determination, a subsequent processing core may be initiated. The subsequent processing core may have a second threshold processing capacity. Additionally, the subsequent processing core may be operable to process data packets of the first plurality of data packets in excess of the first threshold processing capacity associated with the first processing core. The method may then include forwarding the first data packet to the subsequent processing core.

Also provided is a system for using a plurality of processing cores for packet processing in a virtualized network environment. The system may comprise a scheduler. The scheduler may be operable to initiate a first processing core of the plurality of processing cores. The first processing core may be operable to process a first plurality of data packets. Furthermore, the scheduler may receive a first data packet. The scheduler may further determine that the first processing core exceeds a first threshold processing capacity associated with the first processing core. Based on the determination, a subsequent processing core may be initiated. The subsequent processing core may have a second threshold processing capacity and be operable to process data packets of the first plurality of data packets in excess of the first threshold processing capacity associated with the first processing core. The scheduler may then forward the first data packet to the subsequent processing core.

Also provided is a non-transitory computer readable storage medium having a program embodied thereon, the program being executable by a processor to perform a method for packet processing in a virtualized network environment. The method may commence with initiating a first processing core of the plurality of processing cores. The first processing core may be operable to process a first plurality of data packets. Furthermore, the method may include receiving a first data packet. The method may further include determining that the first processing core exceeds a first threshold processing capacity associated with the first processing core. Based on the determination, a subsequent processing core may be initiated. The subsequent processing core may have a second threshold processing capacity. Additionally, the subsequent processing core may be operable to process data packets of the first plurality of data packets in excess of the first threshold processing capacity associated with the first processing core. The method may then include forwarding the first data packet to the subsequent processing core.

In further exemplary embodiments, modules, subsystems, or devices can be adapted to perform the recited steps. Other features and exemplary embodiments are described below.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements.

FIG. 1 illustrates an environment within which systems and methods for using a plurality of processing cores for packet processing in a virtualized network environment can be implemented, in accordance with some embodiments.

FIG. 2 shows schematic diagrams for packet processing in various network environments, in accordance with certain embodiments.

FIG. 3 is a flow chart illustrating a method for using a plurality of processing cores for packet processing in a virtualized network environment, in accordance with some example embodiments.

FIG. 4 is a block diagram showing various modules of a system for using a plurality of processing cores for packet processing in a virtualized network environment, in accordance with certain embodiments.

FIG. 5 shows a schematic diagram of data packet distribution within a virtual machine in a virtualized network environment, in accordance with an example embodiment.

FIG. 6 shows a diagrammatic representation of a computing device for a machine in the exemplary electronic form of a computer system, within which a set of instructions for causing the machine to perform any one or more of the methodologies discussed herein can be executed.

DETAILED DESCRIPTION

The following detailed description includes references to the accompanying drawings, which form a part of the detailed description. The drawings show illustrations in accordance with exemplary embodiments. These exemplary embodiments, which are also referred to herein as “examples,” are described in enough detail to enable those skilled in the art to practice the present subject matter. The embodiments can be combined, other embodiments can be utilized, or structural, logical, and electrical changes can be made without departing from the scope of what is claimed. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope is defined by the appended claims and their equivalents. In this document, the terms “a” and “an” are used, as is common in patent documents, to include one or more than one. In this document, the term “or” is used to refer to a nonexclusive “or,” such that “A or B” includes “A but not B,” “B but not A,” and “A and B,” unless otherwise indicated.

This disclosure provides methods and systems for using a plurality of processing cores for packet processing in a virtualized network environment. Network virtualization includes combining hardware and software network resources and network functionality into a single, software-based virtual network. Network virtualization brings the need to support switching between different virtual machines within the same network. The virtualized network environment may be controlled by a virtual network interface card (also referred to as a virtual network interface controller). More specifically, each virtual machine may instantiate the virtual network interface card to connect to a software driver, also referred to as a driver or a scheduler, enabling sending and receiving network traffic. The driver is a system component (such as a software component) that runs in a kernel mode (i.e., an unrestricted mode) of a processing unit, such as a CPU, and has access to core operating system data. The driver is provided to gain access to data that is available only in kernel mode of the processing unit.

Virtualized networks may be developed as multi-core systems, i.e., systems having multi-core processors. A multi-core processor is a computing component with two or more independent processing units (called “processing cores” or “cores”), which read and execute program instructions.

Thus, the present disclosure describes optimizing packet processing when using multiple processing cores in a virtualized network. Most multi-core systems used in conventional physical environments have a packet-receiving queue for each processing core. Each processing core may be operable to perform a plurality of tasks, such as data packet forwarding (to other components of the network), data packet processing, and so forth. The driver may forward data packets to all the packet-receiving queues evenly, regardless of how many data packets are waiting to be processed. Such scheduling of the data packets may cause extra expenses and waste resources of the processing cores, as the scheduling assumes that all processing cores are fully allocated to the driver with respect to processing load. In this case, the resources of the processing cores are utilized to pull packets from the queues and perform packet forwarding (which requires a real-time response), without yielding the processing cores to other tasks or virtual machines.

Utilizing processing cores as described in the present disclosure can minimize context switching (or switching between different activities) for the processing cores and release available resources of the processing cores to other virtual machines, thereby maximizing the processing core utilization. More specifically, a method described herein utilizes a dynamic queue distribution algorithm to distribute data packets depending on the workload in the virtualized network environment. The virtualized network environment may have a plurality of processing cores. The total number of the processing cores may be a predetermined number of processing cores with the total capacity satisfying the network workload requirements. The dynamic queue distribution algorithm includes forwarding all incoming data packets to a first processing core of the plurality of processing cores. When the traffic volume of data packets increases beyond a certain threshold, a second processing core may be initiated; specifically, the incoming data packets may be forwarded to the first processing core and the second processing core. Therefore, more traffic volume may trigger forwarding the data packets to an additional processing core, one processing core at a time. Eventually, in a fully loaded scenario, the data packets may be forwarded to all available processing cores.

When the traffic volume reduces, the dynamic queue distribution algorithm may change to follow the workload. If the traffic volume reduces to a certain threshold, the data packets are no longer forwarded to the last initiated processing core. Eventually, the last initiated processing core does not have any data packets to process, and the resources of the last initiated processing core can be released to other virtual machines.

By consolidating the packet processing to a limited number of processing cores instead of distributing packets evenly to all available processing cores, the dynamic queue distribution algorithm can balance real-time packet processing and general-purpose processing, reduce the chance of context switching of processing cores, and increase a cache hit rate (represented as a percentage of requests that are served from a cache of a server for retrieving the desired data from the cache). It allows more efficient sharing of processing cores among virtual machines. This improves overall processing performance.
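By way of non-limiting illustration, the scaling behavior of the dynamic queue distribution algorithm can be sketched in Python as follows. The sketch assumes, for simplicity, that every processing core has the same threshold expressed in packets per second. The function name active_core_count and its parameters are hypothetical and are not part of the embodiments described herein.

```python
import math


def active_core_count(traffic_pps: float, threshold_pps: float, total_cores: int) -> int:
    """Return how many cores should receive packets at the given traffic volume.

    Assumption: all cores share one threshold (packets per second).
    """
    if threshold_pps <= 0 or total_cores < 1:
        raise ValueError("threshold_pps must be positive and total_cores at least 1")
    # One additional core is brought in each time the traffic volume
    # passes another multiple of the per-core threshold.
    needed = math.ceil(traffic_pps / threshold_pps)
    # At least the first core stays active; never exceed the available cores.
    return max(1, min(total_cores, needed))
```

With a threshold of 1,000 packets per second and four cores, a load of 1,001 packets per second yields two active cores, and the count returns to one when the load falls back to the threshold or below.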

FIG. 1 illustrates an environment 100 within which systems and methods for using a plurality of processing cores for packet processing in a virtualized network environment can be implemented, in accordance with some embodiments. The environment 100 may include a virtualized environment in a distributed network (not shown), in which incoming network traffic shown as data packets 105 is forwarded to processing units (shown as processing unit A 110, processing unit B 115, and processing unit C 120) of a virtual machine 125. The processing units 110, 115, and 120 may include CPUs.

The environment 100 may further include a software driver shown as a scheduler 130 responsible for receiving the data packets 105 and scheduling the data packets 105 to one or more of the processing units 110, 115, and 120.

FIG. 2 shows schematic diagrams for packet processing, according to example embodiments. Diagram 200 shows data packets 202 received by a scheduler (not shown). An arrow 204 shows an order in which the data packets are forwarded to the scheduler, where a data packet 206 is received by the scheduler first, a data packet 208 is received by the scheduler second, and a data packet 210 is received by the scheduler third.

Diagram 220 schematically shows conventional scheduling of data packets in physical environments. A physical environment (not shown) may have a plurality of processing cores, shown as a processing core A 222, a processing core B 224, and a processing core C 226. A scheduler (not shown) may schedule data packets 228 to each of the processing core A 222, the processing core B 224, and the processing core C 226 in parallel. More specifically, a first data packet 230 (being the data packet received by the scheduler first) may be scheduled to the processing core A 222, a second data packet 232 (being the data packet received by the scheduler second) may be scheduled to the processing core B 224, and a third data packet 234 (being the data packet received by the scheduler third) may be scheduled to the processing core C 226. Subsequently received data packets may be sequentially scheduled to the processing core A 222, the processing core B 224, and the processing core C 226. For example, a fourth data packet 236 (being the data packet received fourth by the scheduler) may be scheduled to the processing core A 222, and so forth.

An arrow 238 shows a packet queue for each of the processing core A 222, the processing core B 224, and the processing core C 226. In an example embodiment, the packet queue associated with the processing core A 222 may include data packets received first, fourth, and seventh by the scheduler. Similarly, the packet queue associated with the processing core B 224 may include data packets received second, fifth, and eighth by the scheduler, while the packet queue associated with the processing core C 226 may include data packets received third and sixth by the scheduler.
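As a minimal sketch, the conventional distribution shown in diagram 220 can be modeled as a round-robin assignment. The function name round_robin and the use of integers to stand in for data packets are assumptions made for illustration only.

```python
def round_robin(packets, num_cores):
    """Place each packet into the queue of the next core in turn."""
    queues = [[] for _ in range(num_cores)]
    for position, packet in enumerate(packets):
        # Packet order alone selects the core; load is not considered.
        queues[position % num_cores].append(packet)
    return queues


# Eight packets, numbered in order of receipt, across three cores.
queues_a_b_c = round_robin(range(1, 9), 3)
```

For eight packets and three cores, the resulting queues hold packets 1, 4, and 7; packets 2, 5, and 8; and packets 3 and 6, which matches the queues described above for the processing cores A 222, B 224, and C 226.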

Diagram 240 schematically shows scheduling data packets in virtualized network environments. A virtualized network environment (not shown) may have a plurality of processing cores, shown as a processing core A 242, a processing core B 244, and a processing core C 246. A scheduler (not shown) may schedule data packets 248 to each of the processing core A 242, the processing core B 244, and the processing core C 246 according to processing capacity of each of the processing core A 242, the processing core B 244, and the processing core C 246. More specifically, a first data packet 250 (being the data packet received first by the scheduler) may be scheduled to the processing core A 242, a second data packet 252 (being the data packet received second by the scheduler) may also be scheduled to the processing core A 242, and a third data packet 254 (being the data packet received third by the scheduler) may likewise be scheduled to the processing core A 242. Upon forwarding the data packet 254 to the processing core A 242, the scheduler may determine that the processing core A 242 exceeds a threshold processing capacity associated with the processing core A 242. Therefore, the scheduler may schedule subsequently received data packets to the processing core B 244. For example, a data packet 256, a data packet 258, and a data packet 260 may be scheduled to the processing core B 244. Upon forwarding the data packet 260 to the processing core B 244, the scheduler may determine that the processing core B 244 exceeds a threshold processing capacity associated with the processing core B 244. Therefore, the scheduler may schedule subsequently received data packets to the processing core C 246, such as a data packet 262 and a data packet 264.

The threshold processing capacity may be different for each of the processing core A 242, the processing core B 244, and the processing core C 246 depending on the purpose of each of the processing core A 242, the processing core B 244, and the processing core C 246. For example, the processing core A 242 and the processing core B 244 may be dedicated for data packet forwarding, while the processing core C 246 may be dedicated to data packet processing. Therefore, the threshold processing capacity for the processing core A 242 and the processing core B 244 may be higher than the threshold processing capacity for the processing core C 246. The processing core C 246 may use a bigger portion of its processing capacity, for example 70 percent, for data packet processing, while a smaller portion of its processing capacity, for example, 30 percent, may be used for data packet forwarding.

An arrow 262 shows a packet queue for each of the processing core A 242, the processing core B 244, and the processing core C 246. In an example embodiment, the packet queue associated with the processing core A 242 may include data packets received first, second, and third by the scheduler. Similarly, the packet queue associated with the processing core B 244 may include data packets received fourth, fifth, and sixth by the scheduler, while the packet queue associated with the processing core C 246 may include data packets received seventh and eighth by the scheduler.
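The sequential distribution shown in diagram 240 can be sketched, under simplifying assumptions, as filling one queue at a time. In this sketch the threshold processing capacity of each core is modeled as a number of queued packets, which is an assumption made for illustration; the function name sequential_fill is likewise hypothetical. Packets arriving after every core has reached its threshold are placed on the last core, which is also an assumption.

```python
def sequential_fill(packets, thresholds):
    """Fill the first core up to its threshold, then move on to the next core."""
    queues = [[] for _ in thresholds]
    core = 0
    for packet in packets:
        # Advance to the next core once the current one holds its threshold,
        # unless the current core is already the last one available.
        while core < len(thresholds) - 1 and len(queues[core]) >= thresholds[core]:
            core += 1
        queues[core].append(packet)
    return queues


# Eight packets, numbered in order of receipt; each core accepts three.
queues_a_b_c = sequential_fill(range(1, 9), [3, 3, 3])
```

With a threshold of three packets per core, packets 1 to 3 are queued for the first core, packets 4 to 6 for the second, and packets 7 and 8 for the third, which matches the queues described above for the processing cores A 242, B 244, and C 246.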

In an example embodiment, the scheduler may send to the first processing core A 242 an amount of the data packets 105 corresponding to a threshold processing capacity of the first processing core A 242. All subsequent data packets 105 may be forwarded to the processing core B 244.

According to another example embodiment, upon determining that the first processing core A 242 exceeds the threshold processing capacity, the scheduler may forward all received data packets 105 to the processing core A 242 and the processing core B 244 by distributing the data packets evenly between the processing core A 242 and the processing core B 244.
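The even distribution between the initiated processing cores described in this embodiment might be sketched as follows. The class name EvenSpread and its methods are hypothetical; the sketch assumes that the decision to initiate a further core is made elsewhere and is communicated by calling activate_next.

```python
class EvenSpread:
    """Cycle incoming packets over the cores that have been initiated so far."""

    def __init__(self, cores):
        self.cores = list(cores)  # cores in their order of initiation
        self.active = 1           # only the first core is initiated at the start
        self._next = 0            # running count of dispatched packets

    def activate_next(self):
        """Initiate one more core, if any remains."""
        if self.active < len(self.cores):
            self.active += 1

    def dispatch(self):
        """Return the core that should receive the next packet."""
        core = self.cores[self._next % self.active]
        self._next += 1
        return core
```

Before any call to activate_next, every packet goes to the first core. After one call, successive packets alternate between the first two cores, while the third core receives nothing and remains available to other virtual machines.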

Thus, a processing core-based scaling of data packet forwarding may be used in the virtualized network environment. Moreover, in the virtualized network environment, each of the processing core A 242, the processing core B 244, and the processing core C 246 may serve as a shared resource (i.e., the processing capacity of each of the processing core A 242, the processing core B 244, and the processing core C 246 may be shared between a plurality of virtual machines). Therefore, sequential distribution of the data packets may allow for not loading all available processing cores, but freeing up processing capacities of the processing cores, which do not currently participate in packet forwarding, to other virtual machines.

FIG. 3 is a flow chart illustrating a method 300 for using a plurality of processing cores for packet processing in a virtualized network environment, in accordance with some example embodiments. The method 300 may commence with initiating a processing core of the plurality of processing cores at operation 302. The processing core may be operable to process a plurality of data packets. The method 300 may continue with determining that the processing core exceeds a threshold processing capacity associated with the processing core at operation 304. The method 300 may further include sequentially initiating at least one subsequent processing core at operation 306. The sequential initiation of the at least one subsequent processing core may be based on the determining that the processing core exceeds a threshold processing capacity associated with the processing core. The at least one subsequent processing core may have a corresponding threshold processing capacity. The at least one subsequent processing core may be operable to process data packets of the plurality of data packets in excess of threshold processing capacities associated with preceding processing cores. Therefore, it may be ensured that the threshold processing capacities associated with the preceding processing cores are not exceeded. In an example embodiment, the at least one subsequent processing core may be operable to process the plurality of data packets until the subsequent processing core exceeds the corresponding threshold processing capacity.

In an example embodiment, the method 300 may optionally include determining that the processing core is below the threshold processing capacity at operation 308. At optional operation 310, the at least one subsequent processing core may be excluded from processing the data packets. The at least one subsequent processing core may include a last initiated subsequent processing core.

In a further example embodiment, the processing core and all sequentially initiated processing cores including the last initiated subsequent processing core may be associated with a first virtual machine. Based on excluding the last initiated subsequent processing core from processing of the data packets, the last initiated subsequent processing core may be assigned to a second virtual machine. Therefore, processing resources of the last initiated subsequent processing core may be used for the purposes of the second virtual machine.
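Operations 308 and 310, together with the reassignment to a second virtual machine, can be sketched as follows. The function name scale_down, the list of active cores, and the dictionary mapping cores to virtual machines are hypothetical structures chosen for illustration. The load and the threshold are assumed to be expressed as fractions of the total capacity of the first core.

```python
def scale_down(active, owner, first_core_load, threshold, second_vm):
    """Exclude the last initiated core and hand it to another virtual machine.

    active: cores currently processing packets, in order of initiation
    owner:  mapping of core name to the virtual machine it is assigned to
    Returns the released core, or None when nothing was released.
    """
    # The first core is never released; a core is released only while
    # the first core is below its threshold processing capacity.
    if first_core_load < threshold and len(active) > 1:
        released = active.pop()      # the last initiated subsequent core
        owner[released] = second_vm  # its resources now serve the second VM
        return released
    return None
```

Calling the function with three active cores and a first-core load of 0.4 against a threshold of 0.8 releases the third core, while a load of 0.9 leaves the set of active cores unchanged.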

In an example embodiment, the method 300 may further include determining a number of the data packets forwarded by each of subsequent processing cores per unit of time with respect to processing capacities, while each of the subsequent processing cores may have a certain processing capacity. Thereby, a current workload of each of the subsequent processing cores may be determined. The threshold processing capacity may constitute a predetermined portion of the total capacity of the processing core.
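One possible way to express the current workload is as a ratio of the observed forwarding rate to the processing capacity of the core. The function names workload and exceeds_threshold, and the choice of packets per second as the unit, are assumptions made for illustration.

```python
def workload(packets_forwarded: int, interval_s: float, capacity_pps: float) -> float:
    """Fraction of a core's capacity used during one measurement interval."""
    if interval_s <= 0 or capacity_pps <= 0:
        raise ValueError("interval_s and capacity_pps must be positive")
    rate_pps = packets_forwarded / interval_s
    return rate_pps / capacity_pps


def exceeds_threshold(load: float, threshold_fraction: float) -> bool:
    """Compare the workload with a predetermined portion of total capacity."""
    return load > threshold_fraction
```

For example, a core that forwards 8,000 packets in one second and has a capacity of 10,000 packets per second has a workload of about 0.8, which exceeds a threshold set at 70 percent of the total capacity.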

In some example embodiments, the data packets sent to the processing core and each of subsequent processing cores may be placed into a plurality of packet queues. The processing core and each of the subsequent processing cores may be associated with one packet queue of the plurality of packet queues. The method 300 may further include monitoring a number of the data packets in each of the plurality of packet queues. Additionally, the method 300 may include ascertaining that the number of the data packets in a packet queue associated with the processing core decreases by a lesser number of data packets than a number of data packets placed into the packet queue per unit of time. In such a case, the determination that the processing core exceeds the threshold processing capacity, mentioned with respect to operation 304, may be based on such ascertaining.
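The queue-based determination can be sketched by comparing, for each unit of time, the number of packets placed into a queue with the number pulled from it. The class name QueueMonitor is hypothetical. Requiring several consecutive intervals before reporting an overload is an assumption added here to avoid reacting to a single burst; it is not stated in the embodiments above.

```python
from collections import deque


class QueueMonitor:
    """Track per-interval enqueue and dequeue counts for one packet queue."""

    def __init__(self, window: int = 3):
        # Only the most recent `window` intervals are kept.
        self.samples = deque(maxlen=window)

    def record(self, enqueued: int, dequeued: int) -> None:
        """Store the counts observed during one unit of time."""
        self.samples.append((enqueued, dequeued))

    def over_threshold(self) -> bool:
        """True when the queue drained more slowly than it filled in every kept interval."""
        full = len(self.samples) == self.samples.maxlen
        return full and all(deq < enq for enq, deq in self.samples)
```

With a window of two intervals, recording 10 packets placed and 8 pulled, followed by 12 placed and 9 pulled, reports that the core exceeds its threshold; a later interval in which the core keeps up clears the condition.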

FIG. 4 is a block diagram showing various modules of a system 400 for using a plurality of processing cores for packet processing in a virtualized network environment, in accordance with certain embodiments. The system 400 may comprise a scheduler 410 and an optional database 420. In various embodiments, the system 400 may reside outside of the organization in a virtualized data center outside control of the organization or be provided as a cloud service.

The scheduler 410 may act as a driver enabling sending and receiving of network traffic. The scheduler 410 may be operable to initiate a processing core of the plurality of processing cores. The processing core may be operable to process a plurality of data packets. The scheduler 410 may be further operable to determine that the processing core exceeds a threshold processing capacity associated with the processing core. Based on the determining, the scheduler 410 may be operable to sequentially initiate at least one subsequent processing core. The at least one subsequent processing core may have a corresponding threshold processing capacity. The at least one subsequent processing core may be operable to process data packets of the plurality of data packets in excess of threshold processing capacities associated with preceding processing cores to ensure that the threshold processing capacities associated with the preceding processing cores are not exceeded. In an example embodiment, the at least one subsequent processing core may be operable to process the plurality of data packets until the subsequent processing core exceeds the corresponding threshold processing capacity. In some embodiments, the threshold processing capacity may constitute a predetermined portion of a total capacity of the processing core.

In further example embodiments, the scheduler 410 may be further operable to determine that the processing core is below the threshold processing capacity. Based on the determining, the scheduler 410 may be operable to exclude the at least one subsequent processing core from processing the data packets. The at least one subsequent processing core may include the last initiated subsequent processing core.

In some embodiments, the processing core and the last initiated subsequent processing core may be associated with a first virtual machine. In such an embodiment, the scheduler 410 may be further operable to assign, based on the excluding, the last initiated subsequent processing core to a second virtual machine.

Each of the subsequent processing cores has a certain processing capacity. The scheduler 410 may be operable to determine a number of the data packets forwarded by each of subsequent processing cores per unit of time with respect to processing capacities to determine a current workload of each of the subsequent processing cores.

The data packets sent to the processing core and each of subsequent processing cores may be placed into a plurality of packet queues. Each of the processing core and the subsequent processing cores may be associated with one of the plurality of packet queues. The scheduler 410 may be further operable to monitor a number of the data packets in each of the plurality of packet queues. Each of the processing core and the subsequent processing cores may be operable to pull the data packets from one of the plurality of packet queues. Pulling of the data packets from the packet queue may be performed, for example, by sending a signal by the processing core indicating that the processing core is available for receiving a further data packet.

Additionally, the scheduler 410 may ascertain that the number of data packets in a packet queue associated with the processing core decreases by a lesser number of data packets than a number of data packets placed into the packet queue per unit of time. Based on such ascertaining, it may be determined that the processing core exceeds the threshold processing capacity.

In a further example embodiment, the scheduler 410 may determine that a number of data packets placed into a packet queue associated with a processing core does not change within a predetermined unit of time. The number of data packets placed into the packet queue may not change in case of a failure of the processing core or in case the processing core is busy with processing a high-priority task. Therefore, in case of such determination, the scheduler 410 may forward further data packets to a subsequent processing core. Additionally, the data packets, which were placed into the packet queue that does not change, may be resent to the subsequent processing core.
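A minimal sketch of this stall handling is given below. The function name drain_if_stalled and the use of Python lists as packet queues are assumptions. The sketch compares the current queue depth with a depth sampled one predetermined unit of time earlier, and it treats an empty queue as idle rather than stalled, which is also an assumption.

```python
def drain_if_stalled(previous_depth, queue, next_queue):
    """Resend queued packets to the next core when the queue made no progress.

    previous_depth: queue depth sampled one predetermined interval earlier
    queue:          packet queue of the core being checked
    next_queue:     packet queue of the subsequent processing core
    Returns True when the queue was found stalled and drained.
    """
    # An unchanged, non-empty queue suggests a failed or preempted core.
    stalled = len(queue) > 0 and len(queue) == previous_depth
    if stalled:
        next_queue.extend(queue)  # resend in the original order
        queue.clear()
    return stalled
```

After a stall is detected, the caller would also direct further incoming packets to the subsequent core, as described above.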

According to another example embodiment, a processing core may receive a high-priority task. The processing core may send a signal to the scheduler 410 informing that the processing core is no longer available for data packet forwarding. Upon receiving such a signal, the scheduler 410 may send further data packets to a subsequent processing core.

Additionally, a security policy of the virtualized network environment may change, and a processing core, which forwards data packets associated with a first virtual machine, may be assigned to a second virtual machine. The processing core may send a signal to the scheduler 410 informing that the processing core is no longer available for data packet forwarding. Upon receiving such a signal, the scheduler 410 may send further data packets to a subsequent processing core.
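The handling of an unavailability signal, whether caused by a high-priority task or by reassignment of the core, can be sketched as follows. The class name FailoverScheduler and its methods are hypothetical, and the signal is modeled as a plain method call for illustration.

```python
class FailoverScheduler:
    """Skip cores that have signaled that they cannot forward packets."""

    def __init__(self, cores):
        self.cores = list(cores)  # cores in their order of initiation
        self.unavailable = set()  # cores that have sent the signal

    def on_unavailable(self, core):
        """Handle a core's signal that it is no longer available."""
        self.unavailable.add(core)

    def target(self):
        """Return the first core, in initiation order, that is still available."""
        for core in self.cores:
            if core not in self.unavailable:
                return core
        raise RuntimeError("no processing core available")
```

In this sketch, once the first core signals that it is unavailable, target returns the second core for all further data packets.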

The optional database 420 may store computer-readable instructions for execution by the scheduler 410.

FIG. 5 shows a schematic diagram 500 of data packet distribution within a virtual machine 505 in a virtualized network environment. The virtual machine 505 may be controlled by a virtual network interface card 510. More specifically, the virtual network interface card 510 may be operable to connect to a scheduler 515 (being a software driver enabling sending and receiving network traffic).

The scheduler 515 may forward received data packets 520 to a first processing core shown as a processing core A 525 until the processing core A 525 exceeds a predetermined threshold capacity associated with the processing core A 525. Data packets 530 received after determining that the processing core A 525 exceeds the predetermined threshold capacity may be forwarded to a second processing core shown as a processing core B 535. The data packets 520 forwarded to the processing core A 525 may be placed into a packet queue 550, and the data packets 530 forwarded to the processing core B 535 may be placed into a packet queue 555. The scheduler 515 may be operable to send a signal 545 to the processing core B 535 instructing the processing core B 535 to start pulling data packets from the packet queue 555. Upon receiving the signal 545, the processing core B 535 may start pulling the data packets 530.

In case the processing core B 535 does not exceed a predetermined threshold capacity associated with the processing core B 535, the scheduler 515 may not forward the data packets to a third processing core shown as a processing core C 540. In such case, a packet queue 560 associated with the processing core C 540 may be empty.
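The distribution shown in FIG. 5 might be illustrated, in a simplified and non-limiting manner, by the Python sketch below. The `Core` class, the `forward` function, and the use of queue depth as a stand-in for the threshold capacity check are all assumptions made for this example. The `signaled` flag loosely corresponds to the signal 545.

```python
from collections import deque


class Core:
    """Hypothetical processing core with its own packet queue."""

    def __init__(self, name, threshold):
        self.name = name
        self.threshold = threshold  # assumed proxy: max queued packets
        self.queue = deque()
        self.signaled = False       # True once told to start pulling packets


def forward(cores, packet):
    """Place the packet in the queue of the first core below its threshold."""
    for core in cores:
        if len(core.queue) < core.threshold:
            core.queue.append(packet)
            core.signaled = True    # stands in for sending the signal
            return core.name
    return None  # every core is at its threshold; handling is left open
```

With three cores A, B, and C, each given a threshold of two packets in this sketch, three incoming packets would be placed in the queues of A, A, and B, and the queue of C would remain empty, which mirrors the case described above.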

In general, the scheduler 515 may be operable to send the signal 545 to each of the processing core A 525, the processing core B 535, and the processing core C 540 instructing to start pulling data packets from the packet queue 550, the packet queue 555, or the packet queue 560, respectively.

In some embodiments, the scheduler 515 may decide to forward data packets to the processing core B 535 when the packet queue 550 of the processing core A 525 reaches a predetermined fill level (for example, when the packet queue 550 is 90 percent full).
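A fill-level check of this kind might be sketched as follows. The function name `should_spill`, the notion of a fixed queue capacity, and the parameter names are assumptions made for illustration; integer arithmetic is used so that the comparison does not depend on floating-point rounding.

```python
def should_spill(depth: int, capacity: int, fill_percent: int = 90) -> bool:
    """Return True when the queue has reached the predetermined fill level.

    depth: number of data packets currently in the queue
    capacity: assumed maximum number of packets the queue can hold
    fill_percent: predetermined fill level, expressed in percent
    """
    return depth * 100 >= capacity * fill_percent
```

In this sketch, a queue with a capacity of 100 packets would trigger forwarding to the next processing core once it holds 90 packets, but not while it holds 89.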

FIG. 6 shows a diagrammatic representation of a computing device for a machine in the exemplary electronic form of a computer system 600, within which a set of instructions for causing the machine to perform any one or more of the methodologies discussed herein can be executed. In various exemplary embodiments, the machine operates as a standalone device or can be connected (e.g., networked) to other machines. In a networked deployment, the machine can operate in the capacity of a server or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine can be a server, a personal computer (PC), a tablet PC, a set-top box (STB), a cellular telephone, a digital camera, a portable music player (e.g., a portable hard drive audio device, such as a Moving Picture Experts Group Audio Layer 3 (MP3) player), a web appliance, a network router, a switch, a bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

The example computer system 600 includes a processor or multiple processors 602, a hard disk drive 604, a main memory 606, and a static memory 608, which communicate with each other via a bus 610. The computer system 600 may also include a network interface device 612. The hard disk drive 604 may include a computer-readable medium 620, which stores one or more sets of instructions 622 embodying or utilized by any one or more of the methodologies or functions described herein. The instructions 622 can also reside, completely or at least partially, within the main memory 606 and/or within the processors 602 during execution thereof by the computer system 600. The main memory 606 and the processors 602 also constitute machine-readable media.

While the computer-readable medium 620 is shown in an exemplary embodiment to be a single medium, the term “computer-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term “computer-readable medium” shall also be taken to include any medium that is capable of storing, encoding, or carrying a set of instructions for execution by the machine and that causes the machine to perform any one or more of the methodologies of the present application, or that is capable of storing, encoding, or carrying data structures utilized by or associated with such a set of instructions. The term “computer-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic media. Such media can also include, without limitation, hard disks, floppy disks, NAND or NOR flash memory, digital video disks, RAM, ROM, and the like.

The exemplary embodiments described herein can be implemented in an operating environment comprising computer-executable instructions (e.g., software) installed on a computer, in hardware, or in a combination of software and hardware. The computer-executable instructions can be written in a computer programming language or can be embodied in firmware logic. If written in a programming language conforming to a recognized standard, such instructions can be executed on a variety of hardware platforms and for interfaces to a variety of operating systems. Although not limited thereto, computer software programs for implementing the present method can be written in any number of suitable programming languages such as, for example, C, Python, JavaScript, Go, or other compilers, assemblers, interpreters or other computer languages or platforms.

Thus, systems and methods for using a plurality of processing cores for packet processing in a virtualized network environment are described. Although embodiments have been described with reference to specific exemplary embodiments, it will be evident that various modifications and changes can be made to these exemplary embodiments without departing from the broader spirit and scope of the present application. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense. 

What is claimed is:
 1. A system for using a plurality of processing cores for packet processing in a virtualized network environment, the system comprising: a computer-implemented scheduler operable to: initiate a first processing core of the plurality of processing cores such that the first processing core is operable to process a first plurality of data packets; receive a first data packet; determine that the first processing core exceeds a first threshold processing capacity associated with the first processing core; based on the determining, initiate a subsequent processing core such that the subsequent processing core is operable to process data packets of the first plurality of data packets in excess of the first threshold processing capacity, the subsequent processing core having a second threshold processing capacity; and forward the first data packet to the subsequent processing core.
 2. The system of claim 1, wherein the computer-implemented scheduler is further operable to: receive a second data packet; determine that a last initiated processing core exceeds a last initiated threshold processing capacity associated with the last initiated processing core; based on the determining, initiate a further subsequent processing core such that the further subsequent processing core is operable to process data packets of the first plurality of data packets in excess of the last initiated threshold processing capacity, the further subsequent processing core having a third threshold processing capacity; and forward the second data packet to the further subsequent processing core.
 3. The system of claim 1, wherein the computer-implemented scheduler is further operable to: receive a second data packet; ascertain that the first processing core is below the first threshold processing capacity; based on the ascertaining, re-initiate a last initiated processing core such that the last initiated processing core is operable to process a second plurality of data packets; and forward the second data packet to the first processing core.
 4. The system of claim 3, wherein the first plurality of data packets is associated with a first virtual machine and the second plurality of data packets is associated with a second virtual machine.
 5. The system of claim 1, wherein the computer-implemented scheduler is further operable to: determine a number of data packets forwarded by the subsequent processing core per unit of time with respect to processing capacities; and determine a current workload of the subsequent processing core based on the determination of the number of data packets forwarded, wherein the subsequent processing core has a processing capacity.
 6. The system of claim 1, wherein the first threshold processing capacity constitutes a predetermined portion of a total capacity of the first processing core.
 7. The system of claim 1, wherein the data packets sent to the processing core and the subsequent processing core are placed into a plurality of packet queues, wherein the first processing core is associated with a first packet queue and the subsequent processing core is associated with a second packet queue, the first data packet being forwarded to the first packet queue associated with the first processing core.
 8. The system of claim 7, wherein the scheduler is further operable to monitor a number of data packets in each of the first and the second packet queues.
 9. The system of claim 8, wherein the scheduler is further operable to: ascertain that a number of data packets in the first packet queue associated with the first processing core is less than a number of data packets placed into the first packet queue per a unit of time, wherein the determining that the first processing core exceeds the first threshold processing capacity is based on the ascertaining.
 10. The system of claim 7, wherein the first processing core is operable to pull the data packets from the first packet queue and the subsequent processing core is operable to pull the data packets from the second packet queue.
 11. A method for using a plurality of processing cores for packet processing in a virtualized network environment, the method comprising: initiating a first processing core of the plurality of processing cores, such that the first processing core is operable to process a first plurality of data packets; receiving a first data packet; determining that the first processing core exceeds a first threshold processing capacity associated with the first processing core; based on the determining, initiating a subsequent processing core such that the subsequent processing core is operable to process data packets of the first plurality of data packets in excess of the first threshold processing capacity, the subsequent processing core having a second threshold processing capacity; and forwarding the first data packet to the subsequent processing core.
 12. The method of claim 11, further comprising: receiving a second data packet; determining that a last initiated processing core exceeds a last initiated threshold processing capacity associated with the last initiated processing core; based on the determining, initiating a further subsequent processing core such that the further subsequent processing core is operable to process data packets of the first plurality of data packets in excess of the last initiated threshold processing capacity, the further subsequent processing core having a third threshold processing capacity; and forwarding the second data packet to the further subsequent processing core.
 13. The method of claim 11, further comprising: receiving a second data packet; ascertaining that the first processing core is below the first threshold processing capacity; based on the ascertaining, re-initiating a last initiated processing core such that the last initiated processing core is operable to process a second plurality of data packets; and forwarding the second data packet to the first processing core.
 14. The method of claim 13, wherein the first plurality of data packets is associated with a first virtual machine and the second plurality of data packets is associated with a second virtual machine.
 15. The method of claim 11, further comprising: determining a number of data packets forwarded by the subsequent processing core per unit of time with respect to processing capacities; and determining a current workload of the subsequent processing core based on the determination of the number of data packets forwarded, wherein the subsequent processing core has a processing capacity.
 16. The method of claim 15, wherein the first threshold processing capacity constitutes a predetermined portion of a total capacity of the first processing core.
 17. The method of claim 11, wherein the data packets sent to the processing core and the subsequent processing core are placed into a plurality of packet queues, wherein the first packet queue is associated with the first processing core and the second packet queue is associated with the second processing core; the first data packet being forwarded to the first packet queue associated with the first processing core.
 18. The method of claim 17, further comprising: monitoring a number of data packets in each of the first and second packet queues.
 19. The method of claim 18, further comprising: ascertaining that a number of the data packets in the first packet queue associated with the first processing core decreases for a lesser number of data packets than a number of data packets placed into the first packet queue per a unit of time, wherein the determining that the first processing core exceeds the first threshold processing capacity is based on the ascertaining.
 20. A non-transitory computer readable storage medium having a program embodied thereon, the program being executable by a processor to perform a method for packet processing in a virtualized network environment, the method comprising: initiating a first processing core of the plurality of processing cores such that the first processing core is operable to process a first plurality of data packets; receiving a first data packet; determining that the first processing core exceeds a first threshold processing capacity associated with the first processing core; based on the determining, initiating a subsequent processing core such that the subsequent processing core is operable to process data packets of the first plurality of data packets in excess of the first threshold processing capacity, the subsequent processing core having a second threshold processing capacity; and forwarding the first data packet to the subsequent processing core. 